Citation-Based Document Categorization: An Approach Using Artificial Neural Networks
Authors
Abstract
The automatic organization of large document collections becomes more important as the amount of information available in digital form grows. This study addresses the issue by evaluating the use of Artificial Neural Networks (ANNs) to automatically categorize documents through analysis of the references they cite. The article describes the method developed to generate clusters of documents based on bibliometric concepts. The method rests on the premise that common citations indicate relationships among documents, so publications are categorized using citations as the main input. ANNs are typically used to solve problems of approximation, prediction, classification, categorization and optimization. Many experiments reported in the literature use Self-Organizing Maps (SOM networks) to organize documents for information retrieval. SOM networks are used in this work to categorize documents in a test database. In this categorization process, the semantic relationships among documents are defined not by terms in common but by common cited references and their years of publication. After the method was validated with a prototype, a database was created containing the references cited in 200 articles published in the journal IEEE Transactions on Neural Networks between 2001 and 2010. The publications were categorized by the ANN and presented in groups organized by their common citations. The results show that the ANN successfully identified clusters of authors and texts through their cited references. These clusters, formed through automatic classification of documents, are evidence of semantic relationships between the documents.
They can be useful, for example, for automatically identifying groups of researchers working in related fields or research trends in specific domains of knowledge. Another application is information retrieval, where they could help users develop or reformulate their queries.
Magali Rezende Gouvêa Meireles and Beatriz Valadares Cendón
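The categorization idea in the abstract (documents grouped by the references they share, using a SOM) can be illustrated with a minimal sketch. This is not the authors' implementation: the document and reference identifiers, the function names, the two-unit one-dimensional map, the learning-rate and neighborhood schedules, and the deterministic initialization from sample documents are all assumptions made for illustration, and the publication years that the study also uses as input are left out.

```python
import math


def build_vectors(doc_refs):
    """Turn {doc_id: set of cited references} into binary citation vectors."""
    vocab = sorted({ref for refs in doc_refs.values() for ref in refs})
    vectors = {
        doc: [1.0 if ref in refs else 0.0 for ref in vocab]
        for doc, refs in doc_refs.items()
    }
    return vocab, vectors


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def best_unit(weights, x):
    """Index of the map unit whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda i: sq_dist(weights[i], x))


def train_som(vectors, n_units=2, epochs=20, lr0=0.5, sigma0=0.5):
    """Train a tiny 1-D SOM (hypothetical parameters, chosen for the toy data)."""
    data = list(vectors.values())
    # Assumption: deterministic start, copying evenly spaced documents
    # instead of drawing random initial weights.
    last = len(data) - 1
    weights = [list(data[i * last // max(n_units - 1, 1)]) for i in range(n_units)]
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs          # linear decay, stays above zero
        lr, sigma = lr0 * decay, sigma0 * decay
        for x in data:
            bmu = best_unit(weights, x)
            for i, w in enumerate(weights):
                # Gaussian neighborhood over positions on the 1-D map.
                h = math.exp(-((i - bmu) ** 2) / (2 * sigma ** 2))
                for k in range(len(w)):
                    w[k] += lr * h * (x[k] - w[k])
    return weights


def categorize(doc_refs, n_units=2):
    """Map each document to the SOM unit that best matches its citations."""
    _, vectors = build_vectors(doc_refs)
    weights = train_som(vectors, n_units=n_units)
    return {doc: best_unit(weights, vec) for doc, vec in vectors.items()}


if __name__ == "__main__":
    # Hypothetical toy data: A and B share r1, r2; C and D share r7, r8.
    toy = {
        "A": {"r1", "r2", "r3"},
        "B": {"r1", "r2", "r4"},
        "C": {"r7", "r8", "r9"},
        "D": {"r7", "r8", "r10"},
    }
    print(categorize(toy))
```

On this toy input, the two documents that cite r1 and r2 share one map unit and the two that cite r7 and r8 share the other. A real collection such as the 200-article database described above would need a larger two-dimensional map and a wider initial neighborhood.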
Similar resources
An Approach of Artificial Neural Networks Modeling Based on Fuzzy Regression for Forecasting Purposes
In this paper, a new modeling approach for Artificial Neural Networks (ANNs) based on the concepts of fuzzy regression is proposed. For this purpose, we reformulate the ANN model as a fuzzy nonlinear regression model that has the advantages of both fuzzy regression and ANN models. Hence, it can be applied to uncertain, ambiguous, or complex environments due to its flexibility for forecas...
Monthly runoff forecasting by means of artificial neural networks (ANNs)
Over the last decade or so, artificial neural networks (ANNs) have become one of the most promising tools for modelling hydrological processes such as rainfall runoff processes. However, the employment of a single model does not seem to be an appropriate approach for modelling such a complex, nonlinear, and discontinuous process that varies in space and time. For this reason, this study aims at de...
An artificial Neural Network approach to monitor and diagnose multi-attribute quality control processes
One of the existing problems of multi-attribute process monitoring is the occurrence of a high number of false alarms (Type I error). Another problem is an increase in the probability of not detecting defects when the process is monitored by a set of independent uni-attribute control charts. In this paper, we address both of these problems and consider monitoring correlated multi-attributes proce...
Prediction of monthly rainfall using artificial neural network mixture approach, Case Study: Torbat-e Heydariyeh
Rainfall is one of the most important elements of the water cycle used in evaluating the climate conditions of each region. Long-term rainfall forecasting for arid and semi-arid regions is very important for managing and planning water resources. To forecast appropriately, accurate data regarding humidity, temperature, pressure, wind speed etc. is required. This article is analytical and its database...
AN INTELLIGENT FAULT DIAGNOSIS APPROACH FOR GEARS AND BEARINGS BASED ON WAVELET TRANSFORM AS A PREPROCESSOR AND ARTIFICIAL NEURAL NETWORKS
In this paper, a fault diagnosis system based on discrete wavelet transform (DWT) and artificial neural networks (ANNs) is designed to diagnose different types of fault in gears and bearings. DWT is an advanced signal-processing technique for fault detection and identification. Five features of wavelet transform RMS, crest factor, kurtosis, standard deviation and skewness of discrete wavelet co...
Publication date: 2015